Tiling arrays are a subtype of microarray chip. Like traditional microarrays, they work by hybridizing labeled DNA or RNA target molecules to probes fixed onto a solid surface. Tiling arrays differ from traditional microarrays in the nature of the probes. Instead of probing for sequences of known or predicted genes, which may be dispersed throughout the genome, tiling arrays probe intensively for sequences known to exist in a contiguous region of the genome. This is useful for characterizing regions that have been sequenced but whose local functions are largely unknown. Tiling arrays aid in transcriptome mapping as well as in discovering sites of DNA-protein interaction (ChIP-chip), of DNA methylation (MeDIP-chip), and of sensitivity to DNase (DNase chip), in addition to other uses (e.g. array CGH).[1] Besides detecting previously unidentified genes and regulatory sequences, they allow improved quantification of transcription products. Each specific probe is present in millions of copies (as opposed to only several in traditional arrays) within an array unit called a feature, with anywhere from 10,000 to more than 6,000,000 different features per array.[2] Variable levels of mapping resolution are obtainable by adjusting the amount of sequence overlap between probes, or the number of known base pairs between probe sequences, as well as the length of the probes themselves. For smaller genomes such as that of Arabidopsis, whole genomes can be examined.[3] Tiling arrays are a useful tool in genome-wide association studies.
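The relationship between probe length, probe spacing and resolution described above can be illustrated with a minimal sketch. It is not any manufacturer's design procedure; the function names `tile_probes` and `spacing` and the parameters `probe_len` and `step` are hypothetical, introduced only for illustration. A step shorter than the probe length gives overlapping probes, while a longer step leaves untiled base pairs between probes.

```python
def tile_probes(sequence, probe_len=25, step=10):
    """Return (start, probe) pairs tiled along a known sequence.

    step is the distance between consecutive probe start positions
    (hypothetical parameter name). step < probe_len gives overlapping
    probes; step > probe_len leaves gaps between probes.
    """
    if probe_len <= 0 or step <= 0:
        raise ValueError("probe_len and step must be positive")
    last_start = len(sequence) - probe_len
    return [(s, sequence[s:s + probe_len])
            for s in range(0, last_start + 1, step)]


def spacing(probe_len, step):
    """Gap between neighbouring probes in bp; a negative value is overlap."""
    return step - probe_len


# A made-up 100 bp region, used only to exercise the functions.
region = "ACGT" * 25
overlapping = tile_probes(region, probe_len=25, step=10)
gapped = tile_probes(region, probe_len=25, step=35)
```

With the same probe length, the smaller step places more probes on the region and therefore samples it at a finer resolution, at the cost of using more features on the array.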
There are two main ways of synthesizing tiling arrays: photolithographic manufacturing and mechanical spotting or printing. The first method involves in situ synthesis, in which probes of approximately 25 bp are built on the surface of the chip. These arrays can hold up to 6 million discrete features, each of which contains millions of copies of one probe. The second method mechanically prints probes onto the chip, using automated machines with pins that place previously synthesized probes onto the surface. Due to the size restriction of the pins, these chips can hold up to nearly 400,000 features.[4] Three manufacturers of tiling arrays are Affymetrix, NimbleGen and Agilent. Their products vary in probe length and spacing.
ChIP-chip is one of the most popular uses of tiling arrays. Chromatin immunoprecipitation (ChIP) is a technique for identifying the binding sites of proteins. A genome-wide variation of this is known as ChIP-on-chip. Proteins that bind to chromatin are cross-linked in vivo, usually via fixation with formaldehyde. The chromatin is then fragmented and exposed to antibodies specific to the protein of interest. These complexes are then precipitated, and the DNA is isolated and purified. With traditional DNA microarrays, the immunoprecipitated DNA is hybridized to a chip whose probes are designed to cover regions representative of the genome. With tiling arrays, however, overlapping probes or probes in very close proximity can be used. This gives an unbiased analysis with high resolution. Besides these advantages, tiling arrays show high reproducibility, and with overlapping probes spanning large segments of the genome, they can still interrogate protein binding sites that harbor repeats. ChIP-chip experiments have been done to identify binding sites of transcription factors across the genome in yeast, Drosophila and a few mammalian species.[5]
Another popular use of tiling arrays is in finding expressed genes. Traditional methods of gene prediction for annotating genomic sequences have had several problems when used to map the transcriptome, such as producing inaccurate gene structures and missing transcripts altogether. Sequencing cDNA to find transcribed genes also runs into problems: it can fail to detect rare RNA molecules, RNA that is not polyadenylated, and very short RNA, and so would not detect genes that are only active in response to signals or specific to a time frame. Tiling arrays can solve these issues in mapping the transcriptome. Due to their high resolution and sensitivity, even small and rare molecules can be detected. The overlapping nature of the probes also allows detection of non-polyadenylated RNA and can produce a more precise picture of gene structure.[6] Earlier studies on chromosomes 21 and 22 showed the power of tiling arrays for identifying transcription units.[7][8][9] The authors used 25-mer probes that were 35 bp apart, spanning the entire chromosomes. Labeled targets were made from polyadenylated RNA. They found many more transcripts than predicted, and 90% were outside of annotated exons. Another study, in Arabidopsis, used high-density oligonucleotide arrays that cover the entire genome. More than 10 times more transcripts were found than predicted by ESTs and other prediction tools.[3][10] Novel transcripts were also found in the centromeric regions, where it was thought that no genes are actively expressed. Many noncoding and natural antisense RNAs have also been identified using tiling arrays.[9]
Methyl-DNA immunoprecipitation followed by tiling array (MeDIP-chip) allows DNA methylation to be mapped and measured across the genome. DNA is methylated on cytosine in CG dinucleotides in many places in the genome. This modification is one of the best-understood inherited epigenetic changes and has been shown to affect gene expression. Mapping these sites can add to the knowledge of expressed genes and of epigenetic regulation on a genome-wide level. Tiling arrays have been used to generate high-resolution methylation maps of the Arabidopsis genome, producing the first “methylome”.
DNase chip is an application of tiling arrays to identify hypersensitive sites, which are segments of open chromatin that are more readily cleaved by DNaseI. DNaseI cleavage produces larger fragments of around 1.2 kb in size. These hypersensitive sites have been shown to accurately predict regulatory elements such as promoter regions, enhancers and silencers.[11] Historically, the method used Southern blotting to find digested fragments; tiling arrays have since been used in its place to apply the technique on a genome-wide scale.
Array-based CGH is a technique often used in diagnostics to compare DNA from different sources, such as normal cells vs. cancer cells. Two types of tiling arrays are commonly used for array CGH: whole-genome and fine-tiled. The whole-genome approach is useful for identifying copy number variations with high resolution. The fine-tiled approach, on the other hand, gives ultrahigh resolution for finding other abnormalities such as breakpoints.[12]
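As a rough illustration of how such a comparison can be expressed numerically, the sketch below computes a per-probe log2 ratio of test to reference signal and labels each probe as a gain, a loss or neutral. This is a simplified assumption about the analysis, not a method taken from the cited source; the function names, the example intensities and the 0.3 threshold are all hypothetical.

```python
import math


def log2_ratios(test, reference):
    """Per-probe log2(test / reference) for two equal-length intensity lists."""
    if len(test) != len(reference):
        raise ValueError("test and reference must have the same length")
    return [math.log2(t / r) for t, r in zip(test, reference)]


def call_copy_number(ratios, threshold=0.3):
    """Label each probe; the threshold is an illustrative value only."""
    calls = []
    for r in ratios:
        if r >= threshold:
            calls.append("gain")
        elif r <= -threshold:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls


# Made-up intensities for three probes (e.g. tumour vs. normal DNA).
test_signal = [200.0, 100.0, 50.0]
reference_signal = [100.0, 100.0, 100.0]
ratios = log2_ratios(test_signal, reference_signal)
calls = call_copy_number(ratios)
```

On a tiling array, runs of neighbouring probes with the same label would then be merged into candidate regions; the denser the tiling, the more precisely the edges of such a region, and hence a breakpoint, can be located.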
There are several different methods for conducting a tiling array experiment. One protocol for analyzing gene expression involves first isolating total RNA, which is then purified of rRNA molecules. The RNA is copied into double-stranded DNA, which is subsequently amplified and in vitro transcribed to cRNA. The product is split into triplicates to produce dsDNA, which is then fragmented and labeled. Finally, the samples are hybridized to the tiling array chip. The signals from the chip are scanned and interpreted by computers.
Various software packages and algorithms are available for data analysis, and their benefits vary depending on the manufacturer of the array chip. For Affymetrix chips, the model-based analysis of tiling array (MAT) is the most effective peak-seeking algorithm. For NimbleGen chips, TAMAL is more suitable for locating binding sites. Alternative algorithms include MA2C and TileScope, which are less complicated to operate. The Joint Binding Deconvolution algorithm is commonly used for Agilent chips. If sequence analysis of binding sites or annotation of the genome is required, then programs such as MEME, Gibbs Motif Sampler, the Cis-regulatory Element Annotation System and Galaxy are used.[4]
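The general idea behind peak-seeking on tiling array data can be sketched with a simple sliding window. The code below is not MAT, TAMAL, MA2C or any of the named tools; it is a deliberately simplified stand-in that assumes each probe has a genomic position and an enrichment score, averages the scores of nearby probes, and merges consecutive probes above a cutoff into regions. All names, window sizes and cutoffs are hypothetical.

```python
def window_scores(positions, scores, half_window=300):
    """Mean score of all probes within half_window bp of each probe."""
    smoothed = []
    for p in positions:
        nearby = [s for q, s in zip(positions, scores)
                  if abs(q - p) <= half_window]
        smoothed.append(sum(nearby) / len(nearby))
    return smoothed


def call_regions(positions, scores, half_window=300, cutoff=1.0):
    """Merge consecutive probes with window score >= cutoff into regions.

    positions must be sorted in ascending order. Returns a list of
    (first_probe_position, last_probe_position) tuples.
    """
    smoothed = window_scores(positions, scores, half_window)
    regions, start, prev = [], None, None
    for p, s in zip(positions, smoothed):
        if s >= cutoff:
            if start is None:
                start = p
            prev = p
        elif start is not None:
            regions.append((start, prev))
            start = None
    if start is not None:
        regions.append((start, prev))
    return regions


# Ten made-up probes spaced 100 bp apart.
positions = list(range(0, 1000, 100))
enriched = [0, 0, 0, 2, 2, 2, 2, 0, 0, 0]   # four adjacent high probes
noisy = [0, 0, 0, 0, 0, 2, 0, 0, 0, 0]      # one isolated high probe
peak = call_regions(positions, enriched, half_window=100, cutoff=1.0)
no_peak = call_regions(positions, noisy, half_window=100, cutoff=1.0)
```

Averaging over neighbouring probes is what lets a dense tiling suppress a single spurious probe while still reporting a run of consistently high probes; real tools additionally model probe-specific behaviour and assess statistical significance.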
The main advantage of tiling arrays is that they provide an unbiased tool for investigating protein binding, gene expression and gene structure on a genome-wide scale. They allow a new level of insight in studying the transcriptome and methylome.
However, there are also certain drawbacks. First and foremost is the issue of expense. Although the cost of tiling array kits has fallen in the last several years, at the moment the price makes it impractical to use genome-wide tiling arrays for larger genomes such as those of mammals. Another issue arises from the ultra-sensitive detection of this technology. For gene expression, an argument against a study in Arabidopsis, which found ten times more genes in the genome than traditional prediction tools, is that the results were confounded by “transcriptional noise”.[2] Furthermore, there is an analysis challenge, as a region of interest has no clearly defined start or stop once it has been identified on the array. Also, the arrays usually give only chromosome and position numbers, so the region of interest must be sequenced (if required) as a separate step, although some modern arrays also give sequence information.